44 research outputs found

    TIDA: A spanish EHR semantic search engine

    Get PDF
    Electronic Health Records (EHR) and the constant adoption of Information Technologies in healthcare have dramatically increased the amount of unstructured data stored. The extraction of key information from this data will bring better caregivers decisions and an improvement in patients? treatments. With more than 495 million people talking Spanish, the need to adapt algorithms and technologies used in EHR knowledge extraction in English speaking countries, leads to the development of different frameworks. Thus, we present TIDA, a Spanish EHR semantic search engine, to give support to Spanish speaking medical centers and hospitals to convert pure raw data into information understandable for cognitive systems. This paper presents the results of TIDA?s Spanish EHR free-text treatment component with the adaptation of negation and context detection algorithms applied in a semantic search engine with a database with more than 30,000 clinical notes

    Tracking recurrent concepts using context

    Get PDF
    The problem of recurring concepts in data stream classification is a special case of concept drift where concepts may reappear. Although several existing methods are able to learn in the presence of concept drift, few consider contextual information when tracking recurring concepts. Nevertheless, in many real-world scenarios context information is available and can be exploited to improve existing approaches in the detection or even anticipation of recurring concepts. In this work, we propose the extension of existing approaches to deal with the problem of recurring concepts by reusing previously learned decision models in situations where concepts reappear. The different underlying concepts are identified using an existing drift detection method, based on the error-rate of the learning process. A method to associate context information and learned decision models is proposed to improve the adaptation to recurring concepts. The method also addresses the challenge of retrieving the most appropriate concept for a particular context. Finally, to deal with situations of memory scarcity, an intelligent strategy to discard models is proposed. The experiments conducted so far, using synthetic and real datasets, show promising results and make it possible to analyze the trade-off between the accuracy gains and the learned models storage cost

    Predicting recurring concepts on data-streams by me ans of a meta-model and a fuzzy similarity function

    Get PDF
    Meta-models can be used in the process of enhancing the drift detection mechanisms used by data stream algorithms, by representing and predicting when the change will occur. There are some real-world situations where a concept reappears, as in the case of intrusion detection systems(IDS), where the same incidents or an adaptation of them usually reappear over time. In these environments the early prediction of drift by means of a better knowledge of past models can help to anticipate to the change, thus improving efficiency of the model regarding the training instances needed. In this paper we present MM-PRec, a meta-model for predicting recurring concepts on data-streams which main goal is to predict when the drift is going to occur together with the best model to be used in case of a recurring concept. To fulfill this goal, MM-PRec trains a Hidden Markov Model (HMM) from the instances that appear during the concept drift. The learning process of the base classification learner feeds the meta-model with all the information needed to predict recurrent or similar situations. Thus, the models predicted together with the associated contextual information are stored. In our approach we also propose to use a fuzzy similarity function to decide which is the best model to represent a particular context when drift is detected. The experiments performed show that MM-PRec outperforms the behaviour of other context-aware algorithms in terms of training instances needed, specially in environments characterized by the presence of gradual drifts

    Automatic extraction and identification of users' responses in Facebook medical quizzes

    Get PDF
    In the last few years the use of social media in medicine has grown exponentially, providing a new area of research based on the analysis and use of Web 2.0 capabilities. In addition, the use of social media in medical education is a subject of particular interest which has been addressed in several studies. One example of this application is the medical quizzes of The New England Journal of Medicine (NEJM) that regularly publishes a set of questions through their Facebook timeline

    Collaborative data stream mining in ubiquitous environments using dynamic classifier selection

    Full text link
    In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In these applications, such as spam filtering or news recommendation, the data stream underlying concept (e.g., interesting mail/news) is likely to change over time. Therefore, the resultant model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that explores the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method where the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate Coll-Stream classification accuracy in situations with concept drift, noise, partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that Coll-Stream resultant model achieves stability and accuracy in a variety of situations using both synthetic and real world datasets

    Multi-agent location system in wireless networks

    Get PDF
    In this paper we propose a flexible Multi-Agent Architecture together with a methodology for indoor location which allows us to locate any mobile station (MS) such as a Laptop, Smartphone, Tablet or a robotic system in an indoor environment using wireless technology. Our technology is complementary to the GPS location finder as it allows us to locate a mobile system in a specific room on a specific floor using the Wi-Fi networks. The idea is that any MS will have an agent known at a Fuzzy Location Software Agent (FLSA) with a minimum capacity processing at its disposal which collects the power received at different Access Points distributed around the floor and establish its location on a plan of the floor of the building. In order to do so it will have to communicate with the Fuzzy Location Manager Software Agent (FLMSA). The FLMSAs are local agents that form part of the management infrastructure of the Wi-Fi network of the Organization. The FLMSA implements a location estimation methodology divided into three phases (measurement, calibration and estimation) for locating mobile stations (MS). Our solution is a fingerprint-based positioning system that overcomes the problem of the relative effect of doors and walls on signal strength and is independent of the network device manufacturer. In the measurement phase, our system collects received signal strength indicator (RSSI) measurements from multiple access points. In the calibration phase, our system uses these measurements in a normalization process to create a radio map, a database of RSS patterns. Unlike traditional radio map-based methods, our methodology normalizes RSS measurements collected at different locations on a floor. In the third phase, we use Fuzzy Controllers to locate an MS on the plan of the floor of a building. Experimental results demonstrate the accuracy of the proposed method. From these results it is clear that the system is highly likely to be able to locate an MS in a room or adjacent room

    A methodology to compare dimensionality reduction algorithms in terms of loss of quality

    Get PDF
    Dimensionality Reduction (DR) is attracting more attention these days as a result of the increasing need to handle huge amounts of data effectively. DR methods allow the number of initial features to be reduced considerably until a set of them is found that allows the original properties of the data to be kept. However, their use entails an inherent loss of quality that is likely to affect the understanding of the data, in terms of data analysis. This loss of quality could be determinant when selecting a DR method, because of the nature of each method. In this paper, we propose a methodology that allows different DR methods to be analyzed and compared as regards the loss of quality produced by them. This methodology makes use of the concept of preservation of geometry (quality assessment criteria) to assess the loss of quality. Experiments have been carried out by using the most well-known DR algorithms and quality assessment criteria, based on the literature. These experiments have been applied on 12 real-world datasets. Results obtained so far show that it is possible to establish a method to select the most appropriate DR method, in terms of minimum loss of quality. Experiments have also highlighted some interesting relationships between the quality assessment criteria. Finally, the methodology allows the appropriate choice of dimensionality for reducing data to be established, whilst giving rise to a minimum loss of quality

    Mining recurring concepts in a dynamic feature space

    Get PDF
    Most data stream classification techniques assume that the underlying feature space is static. However, in real-world applications the set of features and their relevance to the target concept may change over time. In addition, when the underlying concepts reappear, reusing previously learnt models can enhance the learning process in terms of accuracy and processing time at the expense of manageable memory consumption. In this paper, we propose mining recurring concepts in a dynamic feature space (MReC-DFS), a data stream classification system to address the challenges of learning recurring concepts in a dynamic feature space while simultaneously reducing the memory cost associated with storing past models. MReC-DFS is able to detect and adapt to concept changes using the performance of the learning process and contextual information. To handle recurring concepts, stored models are combined in a dynamically weighted ensemble. Incremental feature selection is performed to reduce the combined feature space. This contribution allows MReC-DFS to store only the features most relevant to the learnt concepts, which in turn increases the memory efficiency of the technique. In addition, an incremental feature selection method is proposed that dynamically determines the threshold between relevant and irrelevant features. Experimental results demonstrating the high accuracy of MReC-DFS compared with state-of-the-art techniques on a variety of real datasets are presented. The results also show the superior memory efficiency of MReC-DFS

    Minimal Decision Rules Based on the A Priori Algorithm

    Full text link
    Based on rough set theory many algorithms for rules extraction from data have been proposed. Decision rules can be obtained directly from a database. Some condition values may be unnecessary in a decision rule produced directly from the database. Such values can then be eliminated to create a more comprehensi- ble (minimal) rule. Most of the algorithms that have been proposed to calculate minimal rules are based on rough set theory or machine learning. In our ap- proach, in a post-processing stage, we apply the Apriori algorithm to reduce the decision rules obtained through rough sets. The set of dependencies thus obtained will help us discover irrelevant attribute values

    Mars: a personalised mobile activity recognition system

    Get PDF
    Mobile activity recognition focuses on inferring the current activities of a mobile user by leveraging the sensory data that is available on today’s smart phones. The state of the art in mobile activity recognition uses traditional classification learning techniques. Thus, the learning process typically involves: i) collection of labelled sensory data that is transferred and collated in a centralised repository; ii) model building where the classification model is trained and tested using the collected data; iii) a model deployment stage where the learnt model is deployed on-board a mobile device for identifying activities based on new sensory data. In this paper, we demonstrate the Mobile Activity Recognition System (MARS) where for the first time the model is built and continuously updated on-board the mobile device itself using data stream mining. The advantages of the on-board approach are that it allows model personalisation and increased privacy as the data is not sent to any external site. Furthermore, when the user or its activity profile changes MARS enables promptly adaptation. MARS has been implemented on the Android platform to demonstrate that it can achieve accurate mobile activity recognition. Moreover, we can show in practise that MARS quickly adapts to user profile changes while at the same time being scalable and efficient in terms of consumption of the device resources
    corecore